The Sample Complexity of Dictionary Learning

Authors

  • Daniel Vainsencher
  • Shie Mannor
  • Alfred M. Bruckstein
Abstract

A large set of signals can sometimes be described sparsely using a dictionary, that is, every element can be represented as a linear combination of a few elements from the dictionary. Algorithms for various signal processing applications, including classification, denoising and signal separation, learn a dictionary from a given set of signals to be represented. Can we expect the error incurred when such a dictionary represents a previously unseen signal from the same source to be of similar magnitude to the errors on the given examples? We assume signals are generated from a fixed distribution, and study these questions from a statistical learning theory perspective. We develop generalization bounds on the quality of the learned dictionary for two types of constraints on the coefficient selection, as measured by the expected l2 error in representation when the dictionary is used. For the case of l1-regularized coefficient selection we provide a generalization bound of order O(√(np ln(mλ)/m)), where n is the dimension, p is the number of elements in the dictionary, λ is a bound on the l1 norm of the coefficient vector and m is the number of samples; this complements existing results. For the case of representing a new signal as a combination of at most k dictionary elements, we provide a bound of order O(√(np ln(mk)/m)) under an assumption on the closeness to orthogonality of the dictionary (a low Babel function). We further show that this assumption holds for most dictionaries in high dimensions in a strong probabilistic sense. Our results also include bounds that converge as 1/m, not previously known for this problem. We provide similar results in a general setting using kernels with weak smoothness requirements.
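To make the two coefficient-selection schemes and the Babel function concrete, here is a minimal Python sketch (assuming NumPy and scikit-learn are available; the penalized Lasso and greedy orthogonal matching pursuit used below are common surrogates for the paper's exact constrained formulations, and the dictionary, signals and parameter values are purely illustrative):

    import numpy as np
    from sklearn.linear_model import Lasso, orthogonal_mp

    rng = np.random.default_rng(0)
    n, p, m = 20, 50, 200                    # dimension, atoms, samples
    D = rng.standard_normal((n, p))
    D /= np.linalg.norm(D, axis=0)           # unit-norm columns (atoms)
    Y = rng.standard_normal((n, m))
    Y /= np.linalg.norm(Y, axis=0)           # signals on the unit sphere

    def l1_error(D, y, alpha=0.1):
        # l2 representation error with l1-penalized coefficients; the paper
        # constrains ||a||_1 <= lambda, for which this is a surrogate.
        a = Lasso(alpha=alpha, fit_intercept=False).fit(D, y).coef_
        return np.linalg.norm(y - D @ a)

    def k_sparse_error(D, y, k=5):
        # l2 representation error using at most k atoms, selected greedily.
        a = orthogonal_mp(D, y, n_nonzero_coefs=k)
        return np.linalg.norm(y - D @ a)

    def babel(D, k):
        # Babel function mu_1(k): worst-case sum of absolute inner products
        # between one atom and k others; small values mean the dictionary
        # is close to orthogonal.
        G = np.abs(D.T @ D)
        np.fill_diagonal(G, 0.0)
        G.sort(axis=1)                       # ascending within each row
        return G[:, -k:].sum(axis=1).max()

    print(np.mean([l1_error(D, Y[:, i]) for i in range(m)]))
    print(np.mean([k_sparse_error(D, Y[:, i]) for i in range(m)]))
    print(babel(D, 5))

The empirical averages printed above are the quantities the bounds relate to their expectations over the signal distribution.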

Related papers

Minimax Lower Bounds for Dictionary Learning from Tensor Data

This paper provides lower bounds on the sample complexity of estimating Kronecker-structured dictionaries for Kth-order tensor data. The results suggest the sample complexity of dictionary learning for tensor data can be significantly lower than that for unstructured data.
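For intuition on why the structured case can need fewer samples: a Kronecker-structured dictionary for 2nd-order (matrix) data is the Kronecker product of two small per-mode dictionaries, so it has far fewer free parameters than an unstructured dictionary of the same overall size. A brief sketch (the sizes below are illustrative, not taken from the cited paper):

    import numpy as np

    rng = np.random.default_rng(0)
    D1 = rng.standard_normal((8, 16))    # mode-1 dictionary: 8 x 16
    D2 = rng.standard_normal((10, 20))   # mode-2 dictionary: 10 x 20
    # Dictionary for vectorized 8 x 10 matrix signals.
    D = np.kron(D1, D2)                  # 80 x 320
    # Free parameters: 8*16 + 10*20 = 328 versus 80*320 = 25600 unstructured.
    print(D.shape, D1.size + D2.size, D.size)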

Speech Enhancement using Adaptive Data-Based Dictionary Learning

In this paper, a speech enhancement method based on sparse representation of data frames is presented. Speech enhancement is one of the most widely applied areas in signal processing. The objective of a speech enhancement system is to improve either the intelligibility or the quality of speech signals. This process is carried out using speech signal processing techniques ...

A Novel Face Detection Method Based on Over-complete Incoherent Dictionary Learning

In this paper, the face detection problem is considered using concepts from compressive sensing. This technique includes a dictionary learning procedure and a sparse coding method to represent the structural content of input images. In the proposed method, dictionaries are learned in such a way that the trained models have the least possible coherence with each other. The novelty of the prop...

On the Sample Complexity of Predictive Sparse Coding

Predictive sparse coding algorithms have recently demonstrated impressive performance on a variety of supervised tasks, but they lack a learning-theoretic analysis. We establish the first generalization bounds for predictive sparse coding. In the overcomplete dictionary learning setting, where the dictionary size k exceeds the dimensionality d of the data, we present an estimation error bound t...

A Novel Image Denoising Method Based on Incoherent Dictionary Learning and Domain Adaptation Technique

In this paper, a new method for image denoising based on incoherent dictionary learning and a domain transfer technique is proposed. Sparse representation is one of the most actively studied ideas among researchers in this field. The goal of sparse coding is to approximately model the input data as a weighted linear combination of a small number of basis vectors. Two characteristics should b...

Alternating minimization for dictionary learning with random initialization

We present theoretical guarantees for an alternating minimization algorithm for the dictionary learning/sparse coding problem. The dictionary learning problem is to factorize vector samples y1, y2, ..., yn into an appropriate basis (dictionary) A* and sparse vectors x1*, ..., xn*. Our algorithm is a simple alternating minimization procedure that switches between l1 minimization and gradient ...
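A minimal sketch of such an alternating scheme (assuming NumPy and scikit-learn; the Lasso penalty, step size and iteration count are illustrative choices, not the cited paper's exact algorithm):

    import numpy as np
    from sklearn.linear_model import Lasso

    def alt_min_dictionary_learning(Y, p, n_iter=20, alpha=0.1, lr=0.5):
        # Alternate an l1-penalized coding step with a gradient step on
        # the dictionary for the squared reconstruction error.
        rng = np.random.default_rng(0)
        n, m = Y.shape
        A = rng.standard_normal((n, p))
        A /= np.linalg.norm(A, axis=0)       # unit-norm initial atoms
        for _ in range(n_iter):
            # Coding step: sparse coefficients per sample, dictionary fixed.
            X = np.column_stack(
                [Lasso(alpha=alpha, fit_intercept=False).fit(A, Y[:, i]).coef_
                 for i in range(m)])
            # Dictionary step: gradient of (1/2m)||Y - AX||_F^2 w.r.t. A.
            A -= (lr / m) * (A @ X - Y) @ X.T
            A /= np.maximum(np.linalg.norm(A, axis=0), 1e-12)  # renormalize
        return A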

Journal:
  • Journal of Machine Learning Research

Volume 12, Issue —

Pages —

Publication date: 2011